249 research outputs found

    Poly-Sarcosine and Poly(ethylene-glycol) interactions with proteins investigated using molecular dynamics simulations

    Nanoparticles coated with hydrophilic polymers often show a reduction in unspecific interactions with the biological environment, which improves their biocompatibility. The molecular determinants of this reduction are not yet well understood, and knowing them may help improve nanoparticle design. Here we address, using molecular dynamics simulations, the interactions of human serum albumin, the most abundant serum protein, with two promising hydrophilic polymers used for the coating of therapeutic nanoparticles, poly(ethylene-glycol) (PEG) and poly-sarcosine (PSar). By simulating the protein immersed in a polymer-water mixture, we show that the two polymers have a very similar affinity for the protein surface, both in terms of the amount of polymer adsorbed and in terms of the type of amino acids mainly involved in the interactions. We further analyze the kinetics of adsorption and how it affects the polymer conformations. Minor differences between the polymers are observed in the thickness of the adsorption layer, which are related to the different degrees of flexibility of the two molecules. In comparison, poly-alanine, an isomer of poly-sarcosine known to self-aggregate and induce protein aggregation, shows a significantly larger affinity for the protein surface than PEG and PSar, which we show to be related not to a different pattern of interactions with the protein surface, but to the different way the polymer interacts with water

    Flexible domain prediction using mixed effects random forests

    This paper promotes the use of random forests as versatile tools for estimating spatially disaggregated indicators in the presence of small area-specific sample sizes. Small area estimators are predominantly conceptualised within the regression-setting and rely on linear mixed models to account for the hierarchical structure of the survey data. In contrast, machine learning methods offer non-linear and non-parametric alternatives, combining excellent predictive performance and a reduced risk of model-misspecification. Mixed effects random forests combine advantages of regression forests with the ability to model hierarchical dependencies. This paper provides a coherent framework based on mixed effects random forests for estimating small area averages and proposes a non-parametric bootstrap estimator for assessing the uncertainty of the estimates. We illustrate advantages of our proposed methodology using Mexican income-data from the state Nuevo León. Finally, the methodology is evaluated in model-based and design-based simulations comparing the proposed methodology to traditional regression-based approaches for estimating small area averages

    Small Area with Multiply Imputed Survey Data

    In this article, we propose a framework for small area estimation with multiply imputed survey data. Many statistical surveys suffer from (a) high nonresponse rates due to sensitive questions and response burden and (b) too small sample sizes to allow for reliable estimates on (unplanned) disaggregated levels due to budget constraints. One way to deal with missing values is to replace them by several plausible/imputed values based on a model. Small area estimation, such as the model by Fay and Herriot, is applied to estimate regionally disaggregated indicators when direct estimates are imprecise. The framework presented tackles simultaneously multiply imputed values and imprecise direct estimates. In particular, we extend the general class of transformed Fay-Herriot models to account for the additional uncertainty from multiple imputation. We derive three special cases of the Fay-Herriot model with particular transformations and provide point and mean squared error estimators. Depending on the case, the mean squared error is estimated by analytic solutions or resampling methods. Comprehensive simulations in a controlled environment show that the proposed methodology leads to reliable and precise results in terms of bias and mean squared error. The methodology is illustrated by a real data example using European wealth data

    Modelling the distribution of health related quality of life of advanced melanoma patients in a longitudinal multi-centre clinical trial using M-quantile random effects regression

    Health-related quality of life assessment is important in the clinical evaluation of patients with metastatic disease and may offer useful information for understanding the clinical effectiveness of a treatment. To assess whether a set of explanatory variables affects health-related quality of life, regression models are routinely adopted. However, the interest of researchers may be focussed on modelling other parts (e.g. quantiles) of this conditional distribution. In this paper, we present an approach based on quantile and M-quantile regression to achieve this goal. We applied the methodologies to a prospective, randomized, multi-centre clinical trial. In order to take into account the hierarchical nature of the data, we extended the M-quantile regression model to a three-level random effects specification and estimated it by maximum likelihood

    Robust small area estimation under spatial non-stationarity

    Geographically weighted small area methods have been studied in the literature for small area estimation. Although these approaches are useful for efficiently estimating small area means under strict parametric assumptions, they can be very sensitive to outliers in the data. In this paper, we propose a robust extension of the geographically weighted empirical best linear unbiased predictor (GWEBLUP). In particular, we introduce robust projective and predictive small area estimators under spatial non-stationarity. Mean squared error estimation is performed by two different analytic approaches that account for the spatial structure in the data. The results from the model-based simulations indicate that the proposed approach may lead to gains in terms of efficiency. Finally, the methodology is demonstrated in an illustrative application for estimating the average total cash costs for farms in Australia

    Estimating regional income indicators under transformations and access to limited population auxiliary information

    Spatially disaggregated income indicators are typically estimated by using model-based methods that assume access to auxiliary information from population micro-data. In many countries, such as Germany and the UK, population micro-data are not publicly available. In this work we propose small area methodology for the case in which only aggregate population-level auxiliary information is available. We use data-driven transformations of the response to satisfy the parametric assumptions of the models used. In the absence of population micro-data, appropriate bias-corrections for small area prediction are needed. Under the approach we propose in this paper, aggregate statistics (means and covariances) and kernel density estimation are used to resolve the issue of not having access to population micro-data. We further explore the estimation of the mean squared error using the parametric bootstrap. Extensive model-based and design-based simulations are used to compare the proposed method to alternative methods. Finally, the proposed methodology is applied to the 2011 Socio-Economic Panel and aggregate census information from the same year to estimate the average income for 96 regional planning regions in Germany

    Releasing survey microdata with exact cluster locations and additional privacy safeguards

    Household survey programs around the world publish fine-granular georeferenced microdata to support research on the interdependence of human livelihoods and their surrounding environment. To safeguard the respondents’ privacy, micro-level survey data is usually (pseudo)-anonymized through deletion or perturbation procedures such as obfuscating the true location of data collection. This, however, poses a challenge to emerging approaches that augment survey data with auxiliary information on a local level. Here, we propose an alternative microdata dissemination strategy that leverages the utility of the original microdata with additional privacy safeguards through synthetically generated data using generative models. We back our proposal with experiments using data from the 2011 Costa Rican census and satellite-derived auxiliary information. Our strategy reduces the respondents’ re-identification risk for any number of disclosed attributes by 60–80% even under re-identification attempts

    Estimation of Linear and Non-Linear Indicators using Interval Censored Income Data

    Among a variety of small area estimation methods, one popular approach for the estimation of linear and non-linear indicators is the empirical best predictor. However, parameter estimation using standard maximum likelihood methods is not possible when the dependent variable of the underlying nested error regression model is censored to specific intervals. This is often the case for income variables. Therefore, this work proposes an estimation method that enables the estimation of the regression parameters of the nested error regression model using interval censored data. The introduced method is based on the stochastic expectation maximization algorithm. Since the stochastic expectation maximization method relies on the Gaussian assumptions of the error terms, transformations are incorporated into the algorithm to handle departures from normality. The estimation of the mean squared error of the empirical best predictors is facilitated by a parametric bootstrap which captures the additional uncertainty coming from the interval censored dependent variable. The proposed method is validated by extensive model-based simulations

    estimating literacy rates in Senegal

    Modern systems of official statistics require the accurate and timely estimation of socio-demographic indicators for disaggregated geographical regions. Traditional data collection methods such as censuses or household surveys impose great financial and organizational burdens for National Statistical Institutes. The rise of new information and communication technologies offers promising sources to mitigate these shortcomings. In this paper we propose a unified approach for National Statistical Institutes based on small area estimation that allows for the estimation of socio-demographic indicators by using mobile phone data. In particular, the methodology is applied to mobile phone data from Senegal for deriving sub-national estimates of the share of illiterates disaggregated by gender. The estimates are used to identify hot spots of illiterates with a need for additional infrastructure or policy adjustments. Although the paper focuses on literacy as a particular socio-demographic indicator, the proposed approach is applicable to indicators from national statistics in general